Pascal/Delphi Source File | 1997-05-11 | 62.5 KB | 1,780 lines

Appendix F DVI Multimedia File Format

This appendix describes the DVI multimedia file format used for motion video and audio objects, and for storing compressed and uncompressed still images. This appendix provides information about:

* File structure and version control strategy
* File organization, including tables that describe fields, their settings, and detailed comments, as appropriate
* Derived values

In addition, because of its compatibility with files produced using the ActionMedia II software, the old still image format is documented in this appendix.

Introduction

DVI technology has defined a file format for storing audio/video objects. Applications should use this and other industry standard file formats, to increase interoperability with other applications such as media editing and manipulation tools. The DVI multimedia file format is particularly appropriate for motion video objects that use the compression algorithms, and media objects that use ActionMedia II board pixel formats. Although this appendix references the routines and concepts used with AVK, the file format itself is not restricted for use only with AVK.

The DVI multimedia file format was designed to grow into a general purpose repository for complex multimedia objects, including information that might be added by media object editors. Therefore, it has some reserved fields that are not needed for playback of existing files.

The AVK specification does not require the use of the DVI multimedia file format. However, AVK only supports the data streaming file conventions used in DVI multimedia files. Applications can use other data streaming conventions, by converting the data before passing it to and from group buffers. The AVKIO sample programs in the AVK software release provide examples of programs that read and write files in this format.

General Considerations

Each audio or video file contains one or more streams of data. The following information applies to streams and stream data.
* Each stream typically contains digital data that describes a single audio or video stream. For example, an audio stream can contain ADPCM4 encoded data describing a waveform audio channel. Generally, there can be several such streams, all of which are intended for simultaneous playback. * To reduce head movement on the storage device, the data from the various streams must be interleaved. A frame is the unit of interleaving, and is nominally 1/30 of a second. The actual duration of a frame is stored within the file header. The following sections specify and describe recommended field values to use under AVK. Some existing ActionMedia II software files might use values different from those recommended in this appendix. The descriptions list fields, specify how these fields should be referenced when playing a file, and describe fields that can be ignored. To explain how fields should be referenced, the following criteria are used: * Some fields are used only to verify that the file conforms to a variation of the file structure that can be played by AVK. * Some fields are used to access information that must be passed to AVK through appropriate API routines. * Some fields can be safely ignored. File Structure A file consists of a set of inter-related data structures that describe the organization of the data into streams, the nature of the data in each stream, and the actual data itself. The following information describes the various fields in these data structures, and how to generate and interpret the data they contain. Some of these fields are shown as bytes, words, or longs (U8, I16, U16, I32 or U32). These are standard AVK data types defined in AVKCOM.H. In the context of file interchange it is especially important that the precise length of integer fields be specified unambiguously. 
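As an illustration of why exact integer widths matter here, the following is a minimal sketch of how the U8, I16, U16, I32 and U32 names could be mapped onto C99 exact-width types. The actual definitions in AVKCOM.H are not reproduced in this appendix, so this mapping is an assumption, not the AVK header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mapping of the AVK integer names onto C99 exact-width types.
   The real AVKCOM.H definitions may differ. */
typedef uint8_t  U8;
typedef int16_t  I16;
typedef uint16_t U16;
typedef int32_t  I32;
typedef uint32_t U32;

/* Fail the build if any width differs from what the file format requires. */
static_assert(sizeof(U8)  == 1, "U8 must be 1 byte");
static_assert(sizeof(I16) == 2, "I16 must be 2 bytes");
static_assert(sizeof(U16) == 2, "U16 must be 2 bytes");
static_assert(sizeof(I32) == 4, "I32 must be 4 bytes");
static_assert(sizeof(U32) == 4, "U32 must be 4 bytes");
```

With widths pinned down this way, the structure sizes quoted in the field tables (for example, 12 bytes for StdFileHdr) do not depend on the compiler's choice of int or long.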
In many cases, this appendix gives both the symbolic names that are defined in .H include files provided with the AVK product, and the current values associated with these symbols. The data structures also include explicit fields whose primary purpose is to force compiler-independent word and long alignment, as appropriate. These fields use little endian byte ordering. If these files are used with other processor hardware, the associated software must convert to and from the corresponding byte order. Some of the fields are described as being "offsets". In this context, an offset is a byte count, measured from the beginning of the file to the first byte of some data in the file. While this appendix sometimes specifies recommended values for these offsets, during playback the actual offset in the file should always be used. Otherwise, there might be difficulty processing existing files and future extensions to the file format. Generally, software can move the data pointed at by an offset elsewhere in the file, simply by changing the offset value. As a general rule, an offset of zero means that the associated data is not present. Version Control Strategy The data structures within this file format use a common strategy to allow controlled growth in functionality, without breaking previous functionality. This strategy is implemented by the use of three fields at the beginning of each file and the beginning of many internal data structures. These fields are a four character ASCII ID, a version number and a size in bytes. All three fields are useful, since each deals with a different kind of version mismatch, or binding mismatch problem. HdrID Field The HdrID field is used to validate that this structure is the expected kind of data structure. For the HdrID field at the beginning of the file, such a validation is essential, since the host file system allows end users to move and rename files at will. 
For the HdrID fields in internal data structures, validation of this field merely provides some assurance that the file data has not been corrupted.

HdrSize Field

The HdrSize field gives the length of the data structure in bytes, and is central to the file version control strategy. The file format is modified by adding new fields at the end of a data structure. Software that uses the latest version of the file format must properly set all fields when it creates a file. Fields that are set include those fields that are no longer needed by the latest version of file-reading software. Setting fields needed by previous versions allows older programs that have not been upgraded to the latest file format version to operate correctly within the limitations of the older file format version. In addition, since some data might not have existed when a software version was compiled, extensions to the file format have been carefully limited in ways that prevent old software from misinterpreting data.

File-reading software deals with the HdrSize value in three ways:

1. If the HdrSize in the file is the expected value, this data structure has the expected format. In this case, application software can safely interpret the fields, as described in the section, "File Organization".

2. If the HdrSize in the file is less than the expected value, then this is an old format file, and is missing some expected information. Each file format version contains enough information for the level of processing that had been supported at the time the file was created. In this case, no missing information was essential to processing the file (that is, the fields missing from the file have clearly-defined default values that can be used instead of the missing values). For example, the AvLCim.DCFId field was not part of the original file format definition. This field is set by a Digital Compression Facility (DCF) to provide information on where the video was compressed.
For files that do not contain this field, as indicated by AvLCim.HdrSize (or AvLCim.HdrVer), the default value specifies that the compression site is unknown. A convenient software technique for dealing with the possibility that HdrSize is less than the expected value is to initialize a copy of the data structure with default values, and then only read in HdrSize bytes as given by the actual HdrSize field in the file. 3. If the HdrSize in the file is greater than the expected value, then this file has a format which was extended after the code was written. In this case, there are new fields that have been defined, but the application code lacks the knowledge to interpret them. For file-reading, therefore, only use the information that is described in the version of the file format definition that existed when the code was written. When an old executable is provided new format files to process, the executable might be able to play or process the new format files by ignoring fields that did not exist when the old executable was compiled. To support this processing scenario, applications always set all fields with appropriate values. These fields can be used by older versions of the software, but are ignored by the latest version of the software. A desirable file format extension might require the addition of new fields that might produce files that could not be properly played by old executables. In this case, a new type or SubType might be introduced, as discussed in the section called, "Type and Subtype Fields". Use of new Types and Subtypes makes the new data invisible to the old executable. Although in some situations data can be transcribed from an old file to a new file, even without knowing what data is represented in certain fields, it is recommended that applications totally ignore data in unknown Type or Subtype fields. Ignoring the data is the only guaranteed way to produce software that is compatible with a later file format definition. 
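The three HdrSize cases described above can be sketched as follows. DemoHdr, DEMO_HDR_ID, DEMO_EXTRA_DEFAULT and demo_hdr_load are hypothetical names invented for this illustration and are not part of the file format. The sketch also assumes a little-endian host and a structure layout with no padding, so that bytes can be copied directly from the file image.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_HDR_ID        0x44454d4fu  /* hypothetical ID, not part of the format */
#define DEMO_EXTRA_DEFAULT 0            /* hypothetical default for the newer field */
#define DEMO_MIN_SIZE      8            /* HdrID + HdrSize + HdrVer */

/* Hypothetical header: Extra stands for a field added in a later version. */
typedef struct {
    uint32_t HdrID;
    int16_t  HdrSize, HdrVer;
    int32_t  FrmCnt;
    int32_t  Extra;
} DemoHdr;

/* Load a header from an in-memory file image.
   Returns 0 on success, -1 if the data is truncated or has the wrong ID. */
int demo_hdr_load(DemoHdr *out, const uint8_t *buf, size_t avail)
{
    /* Start from defaults so fields missing from an old file stay defined. */
    DemoHdr hdr = { DEMO_HDR_ID, (int16_t)sizeof(DemoHdr), 2, 0,
                    DEMO_EXTRA_DEFAULT };
    int16_t file_size;
    size_t ncopy;

    if (avail < DEMO_MIN_SIZE) return -1;
    memcpy(&file_size, buf + 4, sizeof file_size);  /* HdrSize as stored */
    if (file_size < DEMO_MIN_SIZE) return -1;

    ncopy = (size_t)file_size;
    if (ncopy > sizeof hdr) ncopy = sizeof hdr;     /* newer file: ignore extra bytes */
    if (ncopy > avail) return -1;                   /* truncated file image */

    memcpy(&hdr, buf, ncopy);                       /* older file: copy only what exists */
    if (hdr.HdrID != DEMO_HDR_ID) return -1;
    *out = hdr;
    return 0;
}
```

After a successful load, HdrSize and HdrVer hold the values stored in the file, so the caller can still tell which version of the structure it was given.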
HdrVersion Field The corresponding HdrVersion field is incremented whenever a new software release adds new fields to a file data structure. Thus, either the HdrVersion or HdrSize field can be used to detect a file being read that does not conform to the current file format definition. Checking the HdrVersion field before using a MAKE utility also provides a convenient way to guard against the effects of "blind" recompilation. For example, suppose the file format has been extended to include new fields. Simply running MAKE against the new header file can produce a program that generates files with the new size data structures and corresponding version numbers, but nevertheless is invalid because of initialization problems. The problem is that the newly-defined fields must be properly initialized, which is very unlikely to occur with code that was written before these fields were defined. For example, the proper way to default the AvLCim.DCFId field is to -1. However, there is no method for software that predates this field to provide the correct default. Type And SubType Fields The AvLStrm data structure contains two fields AvLStrm.Type and AvLStrm.SubType, whose purpose is to describe the kind of data that is contained in a stream. These fields can also be used to make future extensions to the file format. These fields have a limited set of defined values. An unknown AvLStrm.Type or AvLStrm.SubType value indicates that the file format has been extended to allow the presence of data whose interpretation is totally unknown. Existing software, encountering such a value, should ignore the stream's data. File Organization The file structure organization illustrated in Figure F-1 consists of: standard file header, AvLFile header, stream headers (one per stream), substream headers (minimum one per substream), frame data, and a frame directory. ED NOTE: Figures are not available in this plain text version of the specification. 
Figure F-1 DVI Multimedia File Format Structures

Standard File Header

The first two entries in any file consist of a standard file header and an AvLFile data structure.

typedef struct {
    U32 FileId;
    I16 HdrSize, HdrVersion;
    U32 AnnOffset;
} StdFileHdr;

The fields in the StdFileHdr data structure are:

Type  Field Name  Setting/Comments
U32  FileId  Must be set to VSTD_HDR_ID, which equals 0x56445649 (that is, "VDVI"), and should be validated.
I16  HdrSize  Should be set to sizeof(StdFileHdr), which is 12. On playback, this field should be used as described in the section, "Version Control Strategy". Since files do exist that have this field incorrectly set, it is recommended that software reading a file with StdFileHdr.HdrVersion = 1 ignore this field, and respond as if it were set to 12.
I16  HdrVersion  Must be set to VSTD_HDR_VER, which is 1. On playback, this field value should be validated.
U32  AnnOffset  Can be set to zero when creating a file, and can be ignored on playback. It can also be set to point to an otherwise unused portion of the file, and unstructured data placed there. This pointer could be useful for adding copyright notices to the file.

AvLFile Header

The AvLFile data structure always follows immediately after the StdFileHdr.

typedef struct {
    U32 HdrID;
    I16 HdrSize, HdrVer, StrmGrpCnt, StrmGrpSize;
    U32 StrmGrpOffset;
    I16 StrmGrpVer, StrmSize, StrmVer, StrmCnt;
    U32 StrmOffset, HdrPoolOffset;
    I32 LabelCnt;
    U32 LabelOffset;
    I16 LabelSize, LabelVer;
    U32 VshOffset;
    U16 VshSize;
    I16 FrmVer;
    I32 FrmCnt, FrmSize;
    U32 FirstFrmOffset, EndOfFrmsOffset;
    I16 FrmHdrSize, FrmDirSize;
    U32 FrmDirOffset;
    I16 FrmDirVer, FrmsPerSec;
    U32 Flag;
    U32 FreeBlockOffset;
    U8 Patch[32];
} AvLFile;

The AvLFile header is the master directory of data structures within the file. The fields in the AvLFile header data structure are:

Type  Field Name  Setting/Comments
U32  HdrID  Must be set to AVL_FILE_ID, which is 0x41565353 (that is, "AVSS"), and should be validated on playback.
I16 HdrSize Should be set to sizeof(AvLFile), which is 120. This field should be used on playback as described in the section, "Version Control Strategy". I16 HdrVer Should be set to AVL_FILE_VER, which is 3. On playback, the value in the file must be less than or equal to this value. I16 StrmGrpCnt Should be set to zero, and need not be validated on playback. I16 StrmGrpSize Should be set to sizeof(AvLStrmGrp), which is 28, and need not be validated on playback. The AvLStrmGrp data structure is not described in this appendix. U32 StrmGrpOffset Should be set to zero, and need not be validated on playback. I16 StrmGrpVer Must be set to AVL_STRMGRP_VER, which is 3, and need not be validated on playback. I16 StrmSize Must be set to sizeof(AvLStrm), which is 44. This field should be used on playback as described in the section, "Version Control Strategy" for the HdrSize field. I16 StrmVer Should be set to AVL_STRM_VER, which is 3, and should be used on playback as described in the section, "Version Control Strategy" for the HdrVersion field. Type Field Name Setting/Comments I16 StrmCnt The number of streams in the file. This value can be used as the StreamCount parameter with AvkGrpBufCreate. A stream typically consists of a set of bytes that describes a sequence of images or waveform audio samples. Each stream type is described in a separate section later in this appendix. U32 StrmOffset The offset of the array of AvLStrm structures. Usually set to sizeof(StdFileHdr) + sizeof(AvLFile) that is, the offset that points immediately after the AvLFile. Other offset values could also be used, provided the array of AvLStrm headers is placed in the file so that this offset points to it. U32 HdrPoolOffset The offset of a pool of substream headers, described in the section, "Substream Headers". This field should be set to point to this pool when a file is created. 
Typically, this pool begins immediately after the array of AvLStrm headers for StrmCnt, so its value could be set to AvLFile.StrmOffset + (AvLFile.StrmCnt * sizeof(AvLStrm)) This field need not be used during playback, since the first substream header for each stream can be located through AvLStrm.FirstHdrOffset. I32 LabelCnt Should be set to zero, and need not be validated on playback. U32 LabelOffset Should be set to zero, and need not be validated on playback. I16 LabelSize Should be set to sizeof(AvLLabel), which is 20, and need not be validated on playback. I16 LabelVer Should be set to AVL_LABEL_VER, which is 3, and need not be validated on playback. Type Field Name Setting/Comments U32 VshOffset The offset of the video sequence header (VSH) for this file. If none of the streams in this file require a VSH, this field and AvLFile.VshSize are zero. The VSH contains data required for the decompression of all PLV video streams in the file. The VSH data is passed to AVK through the pWorkData parameter of AvkVidStrmFormat. When creating new files under AVK with RTV, set this field to zero. AVK applications can also create files by combining audio/video data from existing or newly created files. Such file editing is legitimate under AVK, provided that no stream uses more than one compression algorithm. However, there are practical difficulties associated with generating a valid VSH. The data in the VSH depends on the actual images, and might differ from file to file, even if the files were compressed with the same PLV compression algorithm. Moreover, the file format only allows for a single VSH per file. Therefore, the merger of one or more video streams into a single file requires: * Combining the original VSHs into a new VSH * Modification of the compressed bitstreams While the PLV algorithms contain sufficient information to implement such software, the process is complex. 
Alternatively, the DOS media preparation utility called VAvEd can be used to create a properly merged VSH from several input video streams. See the Media Preparation Utility Reference For DOS for details on VAvEd.
U16  VshSize  The length of the VSH stored in the file. It is passed to AVK through the pWorkDataSize parameter of AvkVidStrmFormat. When creating a file whose only video is an RTV 2.0 or RTV 2.1 stream, this field should be set to zero. If you create a file with a validly formed VSH, then its size should be stored here.
I16  FrmVer  Should be set to AVL_FRM_VER, which is 3. On playback, the value in the file must be less than or equal to this value.
I32  FrmCnt  Should be set to the number of AvLFrm headers in the file. Typically, this field is initialized after all the frame data has been written into the file. On playback, an application can use this field (or AvLFile.EndOfFrmsOffset) to determine when to stop delivering data to AvkGrpBufWrite.
I32  FrmSize  The size of a frame (frame header plus data for all streams). This field is set to zero if the frames in a file have variable length, as is typical of motion video and audio files. If, however, all frames have exactly the same length, this field contains that length. Such a file could be generated by using an optional parameter with a DOS media preparation utility called VLayout (see the Media Preparation Utility Reference For DOS for details on VLayout). Stream sizes per frame can vary, but the sum of all stream data per frame must be fixed in order for this field to be non-zero.
U32  FirstFrmOffset  The offset to the first frame of interleaved stream data. The interleaved data consists of a sequence of AvLFrm headers. For playback, this interleaved data should be passed to AvkGrpBufWrite, in order to play the file from the beginning. For capture, AvkGrpBufRead is used to extract the frame data from AVK, which is then formatted into AvLFrm headers for storage in the file.
When creating a file, the frame data is placed towards the end of the file, and an appropriate offset stored in this field. On playback, this offset is used to locate the first frame data. To start playing the file from some other point, the appropriate first AvLFrm must be located. This location process can be done either by parsing through the AvLFrm headers, or by using data stored within the optional frame directory. Type Field Name Setting/Comments U32 EndOfFrmsOffset Must be set to the offset to the first byte after the frame data. When creating a file, its value is typically entered after the last byte of frame data has been entered into the file. On playback, no data located at or after this address should ever be passed to AvkGrpBufWrite. I16 FrmHdrSize The size of the frame header used for all frames. This field must be set to the length of the frame header, which is a value computed as sizeof(AvLFrm) + 4 * (AvLFile.StrmCnt - 1) The "-1" is needed because the AvLFrm data structure, as defined, already accounts for the presence of one stream. This field does not have to be validated when a file is played back. A better check could be implemented using the AvLFrm.ChkSum field, described in the section, "Frame Header". I16 FrmDirSize Must be set to sizeof(AvLFrmDir), which is 4. U32 FrmDirOffset The offset to the frame directory. The frame directory provides information that allows random access to an arbitrary frame within the file. It is recommended that all new files contain a frame directory, since it is very useful for random access. Some older files will, however, contain a zero for this field, meaning that the frame directory is missing. A DOS media preparation utility called VAvCopy can be used to add a frame directory to such files (see the Media Preparation Utility Reference For DOS for details on VAvCopy). Typically, the frame directory is physically placed immediately after the frame data. I16 FrmDirVer Must be set to AVL_FRMDIR_VER, which is 3. 
On playback, the value in the file must be less than or equal to this value.
I16  FrmsPerSec  Must be set to the frames-per-second rate, rounded to the nearest integer. By convention, a value of 25 means precisely 25 fps, while any other value is adjusted by the fact that NTSC is 29.97 frames per second, not 30 frames per second. If a file is based on PAL original material, but intentionally has a frame rate that is not 25 frames per second, then a pad stream must be created in order to specify a frame rate that does not have this NTSC adjustment. The use of this field during playback to help derive FrameRates to pass to AVK is described in the section, "Derived Values".
U32  Flag  Should be set to AVL_FILE_INP_UPDATE while a file is being created, and set to zero before the file is closed. If a file is read when this field is non-zero, the data in the file might be incomplete, and should not be used.
U32  FreeBlockOffset  Should be set to zero when creating a file, and need not be validated during file playback.
U8  Patch[32]  Should be set to all zeroes when creating a file, and need not be validated during file playback.

Stream Header

AvLFile.StrmOffset holds the offset to an array of AvLStrm data structures, one for each of the AvLFile.StrmCnt streams in the file. The position in the array defines the stream number. The AvLStrm data structure is:

typedef struct {
    U32 HdrID;
    U16 Type, SubType;
    I16 HdrCnt, NextStrmNum, StrmGrpNum, Pad;
    U32 Flag;
    I32 FrmSize;
    U32 FirstHdrOffset;
    U8 StrmName[16];
} AvLStrm;

This data structure describes the general nature of the data in a single stream, and points to more detailed information in the substream header. The fields in the AvLStrm data structure are:

Type  Field Name  Setting/Comments
U32  HdrID  Must be set to AVL_STRM_ID, which is 0x5354524d (that is, "STRM"), and need not be validated on playback.
U16  Type  Identifies the type of stream data.
The stream type takes one of the following values:

    Type         Value  Description
    AVL_T_AUD    2      Compressed audio stream
    AVL_T_CIM    3      Compressed image stream
    AVL_T_ULAY   5      Associated per-frame data
    AVL_T_UIM    6      Uncompressed image stream
    AVL_T_PAD    7      Pad stream

If a stream has some other value for AvLStrm.Type, its data can and should be ignored.
U16  SubType  These values depend on the value of AvLStrm.Type, and are described in the various sections on substream headers (AvLCim, AvLUlay, AvLUim, and AvLPad).
I16  HdrCnt  Specifies the number of substream headers associated with this stream. This field should be set to one.
I16  NextStrmNum  Should be set to AVL_STRMGRP_END, which is -1, and need not be validated on playback.
I16  StrmGrpNum  Should be set to zero, and need not be validated on playback.
I16  Pad  Should be set to zero, and need not be validated on playback.
U32  Flag  Should be set to 0x4 if the value in AvLStrm.FrmSize is variable. This field need not be validated on playback.
I32  FrmSize  The maximum amount of data per frame in a stream. This field does not include the frame header size. For example, for the X stream, it is the size of the frame, and for the Y stream, it is the size of the largest Y data component for that stream. When creating a file, this value could be computed while the frame data is stored into the file, and then updated into the header after all the frame data has been written. This field is useful in estimating the maximum size of a frame, as described further in the section, "Derived Values".
U32  FirstHdrOffset  The offset to the first substream header for this stream. The data structure at this offset must correspond to the AvLStrm.Type value. This data structure contains additional information about the stream, and is described further in the section, "Substream Headers".
U8  StrmName[16]  A null-terminated ASCII string for the stream name.
StrmName is not used by AVK playback, and can be set to all zeroes (which is interpreted as a null string). It is helpful, however, to set this field for use with the output of a DOS media preparation utility called VAvCheck. See the Media Preparation Utility Reference For DOS for details on VAvCheck. Substream Headers An AvLStrm data structure contains general information about a stream. Type-dependent information is stored in substream headers of the following type: AvLAud, AvLCim, AvLUlay, AvLUim and AvLPad. All the substream headers are located in a pool pointed to by AvLFile.HdrPoolOffset. The pool is located near the beginning of the file, to minimize the amount of seeking while the frame data is being processed. This is especially useful for files stored on devices like a CD-ROM that have comparatively slow seek times. AvLAud: The Audio Substream Header The AvLAud substream header describes the global characteristics of an audio stream. typedef struct { U32 HdrID; I16 HdrSize, HdrVer; U8 OrigFile[80]; I32 OrigFrm; I16 OrigStrm, Pad; I32 FrmCnt; U32 NextHdrOffset; U8 Lib[16], Alg[16]; I32 Parm1; I16 Parm2, Parm3, LeftVol, RightVol; I32 LoopOffset, StartFrm; U32 Flag; I16 Parm4, Pad2; I32 DCFId; } AvLAud; For an audio stream, AvLStrm.SubType should be set to and validated for the value zero. The fields in the AvLAud data structure are: Type Field Name Setting/Comments U32 HdrID Should be set to AVL_AUD_ID, which is 0x41554449 (that is, "AUDI"), and should be validated on playback. I16 HdrSize Should be set to sizeof(AvLAud), which is 168. This field should be used on playback as described in the section on version control strategy. I16 HdrVer Should be set to AVL_AUD_VER, which is 5. On playback, the value in the file must be less than or equal to this value. U8 OrigFile[80] Should be set to all zeroes, and need not be validated on playback. I32 OrigFrm Should be set to zero, and need not be validated. 
I16  OrigStrm  Should be set to zero, and need not be validated.
I16  Pad  Should be set to zero, and need not be validated on playback.
I32  FrmCnt  The number of frames.
U32  NextHdrOffset  The offset to the next substream header for this stream. This field should be set to AVL_LAST_HDR, which is 0x7fffffff, or zero for the last header.
U8  Lib[16]  Should be set to all zeroes, and need not be validated.
U8  Alg[16]  When creating a file, this field should be set to a null-terminated text string that identifies the audio compression algorithm. On playback, this field is used to derive other quantities that are passed to AVK to control playback, as described in the section, "Derived Values".
I32  Parm1  Should be set to the audio data rate in bits per second. This value is related to the SamplesPerSecond value used with AvkAudStrmFormat, as described in the section, "Derived Values".
I16  Parm2  The filter cutoff frequency to be used with the audio. This should be set to zero.
I16  Parm3  Should be set to zero.
I16  LeftVol, RightVol  Should be set to 100 on file creation. These fields are intended to enable an editor to modify the volume level associated with an audio stream. On playback, these volume level numbers should be treated as a percentage of full volume, and used to form a multiplier with the application-specified volume before that volume is passed to AvkAudStrmVolume. In a few old files, this value was set to 4096. If 4096 is found in a file, it should be treated as if it were 100.
I32  LoopOffset  Should be set to -1, and need not be validated on playback.
I32  StartFrm  Should be set to zero, and need not be validated on playback.
U32  Flag  Used to signify monophonic or stereo. This field is zero for mono and AVL_AUD_STEREO, which is 0x00004000, for stereo. Files might exist in which the 0x00008000 bit is set. This bit denotes an old format for adpcm4e stereo which cannot be played by AVK.
I16 Parm4 Should be set to the FrameRate used with AvkAudStrmFormat when audio compression was requested. Typically, this FrameRate is the same for all streams of the file, and so this field can be set to zero. For playback, the proper way to determine the audio FrameRate is described in the section, "Derived Values". I16 Pad2 Should be set to zero, and need not be validated on playback. Type Field Name Setting/Comments I32 DCFId Should be set to -1, and need not be validated on playback. This value denotes generation on an end-user platform. A value of zero means that the Digital Compression Facility (DCF) it was generated on is unknown. A current list of DCFId's can be obtained from compression services. AvLCim: The Compressed Image And Compressed Video Substream Header The AvLCim substream header is used for compressed motion video streams and compressed still images. The compressed still images are distinguished by the use of: * Specific values of AvLStrm.SubType * Different values of DeCodeAlg The various SubTypes that can be used are: SubType Value Description AVL_ST_Y 1 Y-channel image data AVL_ST_U 11 U-channel image data AVL_ST_V 12 V-channel image data AVL_ST_YVU 13 YVU image data AVL_ST_YUV_S 14 YUV image data (Industry Standard Order) For the PLV algorithms, three streams (Y, V, U) are used to convey the information within one "logical" video stream, which explains the stream counting rules in AvkVidStrmCreate and AvkAudStrmCreate. Two SubTypes, AVL_ST_YVU and AVL_ST_YUV_S, are used to hold sequences of images, in which the entire image is contained in a single stream. (The suffix "_S" in AVL_ST_YUV_S is used only to distinguish the difference between two otherwise very similar SubType names.) These SubTypes differ only in respect to the order in which the color components are stored. All DVI video images use a YVU order for the color components, except for JPEG. 
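The SubType values listed above can be captured in a small lookup, shown here as a minimal sketch. The numeric values come from the table in this section; the function names avl_cim_subtype_desc and avl_cim_subtype_is_whole_image are hypothetical helpers introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* AvLCim SubType values, as listed in the SubType table. */
enum {
    AVL_ST_Y     = 1,   /* Y-channel image data */
    AVL_ST_U     = 11,  /* U-channel image data */
    AVL_ST_V     = 12,  /* V-channel image data */
    AVL_ST_YVU   = 13,  /* YVU image data */
    AVL_ST_YUV_S = 14   /* YUV image data (industry standard order) */
};

/* Hypothetical helper: description for a SubType, or NULL if unknown.
   A reader should ignore streams whose SubType it does not recognize. */
const char *avl_cim_subtype_desc(uint16_t subtype)
{
    switch (subtype) {
    case AVL_ST_Y:     return "Y-channel image data";
    case AVL_ST_U:     return "U-channel image data";
    case AVL_ST_V:     return "V-channel image data";
    case AVL_ST_YVU:   return "YVU image data";
    case AVL_ST_YUV_S: return "YUV image data (industry standard order)";
    default:           return NULL;
    }
}

/* Hypothetical helper: nonzero if the entire image is held in one stream,
   rather than in separate Y, V and U streams. */
int avl_cim_subtype_is_whole_image(uint16_t subtype)
{
    return subtype == AVL_ST_YVU || subtype == AVL_ST_YUV_S;
}
```

Returning NULL for unrecognized values keeps the caller on the path recommended in the section, "Type And SubType Fields": skip the stream instead of guessing at its contents.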
AVK does not support the playing of motion video for arbitrary streams of compressed images, but only for a few explicitly identified algorithms. Such images can, however, be displayed by loading them into image buffers, using AvkImgDecompress, and using a suitable connector.

typedef struct {
    U32 HdrID;
    I16 HdrSize, HdrVer;
    U8 OrigFile[80];
    I32 OrigFrm;
    I16 OrigStrm, Pad;
    I32 FrmCnt;
    U32 NextHdrOffset;
    I16 XPos, YPos, XLen, YLen;
    I16 XCrop, YCrop, DropFrm, DropPhase;
    I32 StillPeriod;
    I16 BufsMin, BufsMax, DeCodeAlg, Pad2;
    I32 DCFId;
} AvLCim;

This substream data structure can be used to store several different kinds of compressed images. The fields in the AvLCim data structure are:

Type  Field Name  Setting/Comments
U32  HdrID  Should be set to AVL_CIM_ID, which is 0x43494d47 (that is, "CIMG"), and should be validated on playback.
I16  HdrSize  Should be set to sizeof(AvLCim), which is 136. This field should be used on playback as described in the section, "Version Control Strategy".
I16  HdrVer  Should be set to AVL_CIM_VER, which is 4. On playback, the value in the file must be less than or equal to this value.
U8  OrigFile[80]  Should be set to zero, and need not be validated at playback.
I32  OrigFrm  Should be set to zero, and need not be validated at playback.
I16  OrigStrm  Should be set to zero, and need not be validated at playback.
I16  Pad  Should be set to zero, and need not be validated at playback.
I32  FrmCnt  The number of frames until the next substream header applies. This field should be AVL_LAST_HDR, which is 0x7fffffff.
U32  NextHdrOffset  The offset to the next substream header for this stream. This field should be set to zero.
I16  XPos, YPos  Should be set to zero, and need not be validated on playback.
I16  XLen, YLen  Specifies the maximum width and height for the decompressed images in this file.
An AVK application should set these fields properly, and validate that these fields will not overflow the limits imposed by the Xres and Yres passed to the AvkVidStrmFormat call. When using AvkImgDecompress, AVK should report an error if the destination image is too small. However, when playing motion video, the microcode might not detect an attempt to decompress an image into a bitmap too small to hold it, causing unpredictable results.
I16   XCrop, YCrop      Should be set to zero, and need not be validated on playback.
I16   DropFrm, DropPhase  Should be set to zero, and need not be validated on playback.
I32   StillPeriod       Indicates that the video was compressed such that every Nth frame was intraframe encoded. For example, if this value is one, every image of the stream is a still image. If this value is three, then frames numbered "0, 3, 6, 9, 12, ..." are all still images. In addition to these intraframe coded images, the stream might contain additional intraframe encoded images. The default value of AVL_CIM_RANDOM_STILL, which is -1, indicates that intraframe image spacing is unspecified.
I16   BufsMin, BufsMax  Should be set to zero.
I16   DecodeAlg         Should be set to the decompression algorithm. On playback, the AvLCim.DecodeAlg value should be passed to AvkVidStrmFormat or AvkImgDecompress.
I16   Pad2              Should be set to zero, and need not be validated on playback.
I32   DCFId             Operates the same as AvLAud.DCFId. This field should be set to -1, and need not be validated on playback.

AvLUlay: The Underlay Substream Header

Underlay streams hold digital data associated with the same interval of time as the other streams that are present in each frame. Generally, each SubType can have its own underlay substream header definition. However, for many kinds of underlay data, the following generic underlay substream header can be used.
    typedef struct {
        U32 HdrID;
        I16 HdrSize, HdrVer;
        U8  OrigFile[80];
        I32 OrigFrm;
        I16 OrigStrm, Pad;
        I32 FrmCnt;
        U32 NextHdrOffset;
        I32 DCFId;
    } AvLUlay;

Only one underlay SubType is supported, to be used for holding SMPTE timecodes (see the section, "SMPTE Timecode Underlay Streams" for details). In addition, the range of SubType values from zero through 32767 has been reserved for possible registration of specific, to-be-determined, well-defined uses. SubTypes greater than 32767 will not be controlled and can be freely used for application-specific purposes.

In a multiple stream file, several underlay streams can exist with the same SubType, each associated with another interleaved stream. In this case, by convention, the data in the underlay stream applies to the closest preceding stream of an appropriate type.

The fields in the AvLUlay data structure are:

Type  Field Name        Setting/Comments
U32   HdrID             Should be set to AVL_ULAY_ID, which is 0x554e4452 (that is, "UNDR"), and need not be validated on playback.
I16   HdrSize           Should be set to sizeof(AvLUlay), which is 108. This field should be used on playback, as described in the section, "Version Control Strategy".
I16   HdrVer            Should be set to AVL_ULAY_VER, which is 4. On playback, the value in the file must be less than or equal to this value.
U8    OrigFile[80]      Should be set to zero, and need not be validated at playback.
I32   OrigFrm           Should be set to zero, and need not be validated at playback.
I16   OrigStrm          Should be set to zero, and need not be validated at playback.
I16   Pad               Should be set to zero, and need not be validated at playback.
I32   FrmCnt            The number of frames until the next substream header applies. This field should be AVL_LAST_HDR, which is 0x7fffffff.
U32   NextHdrOffset     The offset to the next substream header for this stream. This field should be set to zero.
I32   DCFId             Operates the same as AvLAud.DCFId. This field should be set to -1, and need not be validated on playback.
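The settings in the AvLUlay table can be sketched in C as shown below. This is only an illustration under stated assumptions, not code from the AVK release: the U32/I32/I16/U8 typedefs are mapped onto <stdint.h> types, the helper names avl_ulay_init and avl_ulay_version_ok are hypothetical, and the sketch works on in-memory values only (it does not address how the header is read from or written to a file).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed mapping of the document's type names onto fixed-width C types. */
typedef uint32_t U32;
typedef int32_t  I32;
typedef int16_t  I16;
typedef uint8_t  U8;

#define AVL_ULAY_ID   0x554e4452u  /* "UNDR" */
#define AVL_ULAY_VER  4
#define AVL_LAST_HDR  0x7fffffff

typedef struct {
    U32 HdrID;
    I16 HdrSize, HdrVer;
    U8  OrigFile[80];
    I32 OrigFrm;
    I16 OrigStrm, Pad;
    I32 FrmCnt;
    U32 NextHdrOffset;
    I32 DCFId;
} AvLUlay;

/* The table gives sizeof(AvLUlay) as 108, which holds when the compiler
   inserts no padding between members. */
static_assert(sizeof(AvLUlay) == 108, "unexpected AvLUlay layout");

/* Hypothetical helper: fill a header with the values the table prescribes. */
static void avl_ulay_init(AvLUlay *h)
{
    memset(h, 0, sizeof *h);   /* OrigFile, OrigFrm, OrigStrm, Pad, NextHdrOffset */
    h->HdrID   = AVL_ULAY_ID;
    h->HdrSize = (I16)sizeof(AvLUlay);
    h->HdrVer  = AVL_ULAY_VER;
    h->FrmCnt  = AVL_LAST_HDR;
    h->DCFId   = -1;
}

/* Hypothetical helper: the playback rule that the version stored in the
   file must be less than or equal to AVL_ULAY_VER. */
static int avl_ulay_version_ok(const AvLUlay *h)
{
    return h->HdrVer <= AVL_ULAY_VER;
}
```

The same pattern would apply to the other substream headers, with their own ID, version, and size constants.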
AvLUim: The Uncompressed Image Substream Header

The AvLUim data structure is used to hold uncompressed images.

    typedef struct {
        U32 HdrID;
        I16 HdrSize, HdrVer;
        U8  OrigFile[80];
        I32 OrigFrm;
        I16 OrigStrm, Pad;
        I32 FrmCnt;
        U32 NextHdrOffset;
        I16 XPos, YPos, XLen, YLen, PixBits, Pad2;
        I32 DCFId;
    } AvLUim;

The fields in the AvLUim data structure are:

Type  Field Name        Setting/Comments
U32   HdrID             Should be set to AVL_UIM_ID, which is 0x55494d47 (that is, "UIMG"), and need not be validated on playback.
I16   HdrSize           Should be set to sizeof(AvLUim), which is 124. This field should be used on playback, as described in the section, "Version Control Strategy".
I16   HdrVer            Should be set to AVL_UIM_VER, which is 4. On playback, the value in the file must be less than or equal to this value.
U8    OrigFile[80]      Should be set to zero, and need not be validated at playback.
I32   OrigFrm           Should be set to zero, and need not be validated at playback.
I16   OrigStrm          Should be set to zero, and need not be validated at playback.
I16   Pad               Should be set to zero, and need not be validated at playback.
I32   FrmCnt            The number of frames until the next substream header applies. This field should be AVL_LAST_HDR, which is 0x7fffffff.
U32   NextHdrOffset     The offset to the next substream header for this stream. This field should be set to zero.
I16   XPos, YPos        Should be set to zero, and need not be validated on playback.
I16   XLen, YLen        Specifies the maximum width and height for the decompressed images in this file.
I16   PixBits           Should be set to the average number of bits per pixel in the image. Typical values for this field are 8, 9, 16 and 24. On playback, AvLUim.PixBits should be used to determine the value of BitmapFormat to pass on a call of AvkImgCreate.
I16   Pad2              Should be set to zero, and should be ignored on playback.
I32   DCFId             Operates the same as AvLAud.DCFId. This field should be set to -1, and need not be validated on playback.
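Because PixBits is an average number of bits per pixel, it can be combined with XLen and YLen to estimate how much storage one uncompressed image needs. The sketch below is plain arithmetic under that assumption; the helper name avl_uim_estimated_bytes is hypothetical, and the result is an estimate only, since the exact layout of each bitmap format is not described in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimated bytes for one uncompressed image,
   computed as width * height * average bits per pixel, rounded up
   to a whole byte. Returns 0 for non-positive inputs. */
static uint64_t avl_uim_estimated_bytes(int16_t xlen, int16_t ylen,
                                        int16_t pix_bits)
{
    if (xlen <= 0 || ylen <= 0 || pix_bits <= 0)
        return 0;
    uint64_t bits = (uint64_t)xlen * (uint64_t)ylen * (uint64_t)pix_bits;
    return (bits + 7) / 8;
}
```

For example, a 256 by 240 image at an average of 9 bits per pixel comes to 256 * 240 * 9 / 8 = 69,120 bytes by this estimate.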
AvLPad: The Pad Substream Header

Pad streams are generated by a DOS media preparation utility called VLayout. See the Media Preparation Utility Reference For DOS for details on VLayout. A pad stream header can store a frame rate more accurately than is possible with AvLFile.FrmsPerSec. VLayout generates pad data in the various AvLFrm headers, such that the average rate of data consumption precisely matches the standard data rate from a CD-ROM (153,600 bytes per second). For VLayout, this pad data is set to all zero.

On playback, most of the data in pad streams can be ignored. However, any existing pad streams should be used to derive accurate FrameRates for all streams during playback, as described in the section, "Derived Values".

    typedef struct {
        U32 HdrID;
        I16 HdrSize, HdrVer;
        U8  OrigFile[80];
        I16 OrigStrm, Pad;
        I32 FrmCnt;
        U32 NextHdrOffset;
        I32 ImagesPer, Seconds, VidFast, VidVar, VidRev, VidStart;
        I16 UlayFast, UlayVar, UlayRev, UlayStart;
        I16 PipeDepth, PipeStart, MinSeek, MinPad;
        I32 DCFId;
    } AvLPad;

The fields of the AvLPad data structure are:

Type  Field Name        Setting/Comments
U32   HdrID             Should be set to AVL_PAD_ID, which is 0x50414421 (that is, "PAD!"), and should be validated on playback.
I16   HdrSize           Should be set to sizeof(AvLPad), which is 144. This field should be used on playback as described in the section, "Version Control Strategy".
I16   HdrVer            Should be set to AVL_PAD_VER, which is 4. On playback, the value in the file must be less than or equal to this value.
U8    OrigFile[80]      Should be set to zero, and need not be validated at playback.
I16   OrigStrm          Should be set to zero, and need not be validated at playback.
I16   Pad               Should be set to zero, and need not be validated at playback.
I32   FrmCnt            The number of frames until the next substream header applies. This field should be AVL_LAST_HDR, which is 0x7fffffff.
U32   NextHdrOffset     The offset to the next substream header for this stream.
This field should be set to zero.
I32   ImagesPer, Seconds  Two 32-bit integers whose ratio is the frame rate in images per second. These fields should be set to zero, and need not be validated on playback. On playback, these fields are used to derive the frame rate for all streams in the file, as described in the section, "Derived Values".
I32   VidFast           Should be set to zero, and need not be validated on playback.
I32   VidVar            Should be set to zero, and need not be validated on playback.
I32   VidRev            Should be set to zero, and need not be validated on playback.
I32   VidStart          Should be set to zero, and need not be validated on playback.
I16   UlayFast          Should be set to zero, and need not be validated on playback.
I16   UlayVar           Should be set to zero, and need not be validated on playback.
I16   UlayRev           Should be set to zero, and need not be validated on playback.
I16   UlayStart         Should be set to zero, and need not be validated on playback.
I16   PipeDepth         Should be set to zero, and need not be validated on playback.
I16   PipeStart         Should be set to zero, and need not be validated on playback.
I16   MinSeek           Should be set to zero, and need not be validated on playback.
I16   MinPad            Should be set to zero, and need not be validated on playback.
I32   DCFId             Operates the same as AvLAud.DCFId. This field should be set to -1, and need not be validated on playback.

SMPTE Timecode Underlay Streams

If the SubType of an underlay stream is AVL_ST_TIMECODE (which is 1), the stream contains SMPTE timecode data. The data in each frame consists of four bytes that are the Binary-Coded-Decimal representation of the HH:MM:SS:FF for that frame, as defined by the SMPTE standard for time codes.

Generally, several interleaved audio and/or video streams can exist in a file, each with its own timecode data. The rule for associating a timecode stream with audio or video data is that the timecode stream refers to the immediately preceding audio or video stream.
For example, a file compressed by compression services might have the following six stream types: Y, V, U, Timecode, Audio, Timecode. This stream order indicates that there is valid (and possibly equal) timecode information for both the video and audio data. Typically, the data to fill this stream is extracted from a time code reader at the same time as the original video and audio are digitized.

Frame Data

Each frame of data in a DVI multimedia file is preceded by a frame header, identifying the amount of data per stream.

Frame Header

This data structure is used to introduce the actual data of the file. All the other headers only describe this data. The data consists of a sequence of contiguous AvLFrm header/data pairs, one for each frame of the file. See Appendix C, "Algorithm Characteristics", for details on interpreting data during playback or capture.

    typedef struct {
        I32 FrmNum, RevOffset, ChkSum;
        // I32 StrmFrmSize[AvLFile.StrmCnt];  (intended layout, but invalid C syntax)
        I32 StrmFrmSize[1];  // Valid C syntax, but the real array length is AvLFile.StrmCnt
    } AvLFrm;

The fields of the AvLFrm data structure are:

Type  Field Name        Setting/Comments
I32   FrmNum            The sequential frame number within each file, starting with zero. AVK playback does not use this field, so that several files can be opened and fed in sequence to AVK, and so that the data from several files can be conveniently concatenated. This field is generated by AVK capture, for use in identifying the precise time that each frame's data occurred.
I32   RevOffset         The file offset to the previous AvLFrm in the file (measured from the beginning of the file). For the first frame of a file, this file offset is zero. This field must be properly generated on file creation. While AvLFrm.RevOffset is typically ignored on AVK playback, this offset is used by some DOS media preparation utilities provided with ActionMedia II software.
I32   ChkSum            Provides an efficient check of whether or not a given block of data begins with a valid AvLFrm. Its value is formed by exclusive ORing all AvLFile.FrmHdrSize 32-bit words in the frame header (excluding this one) with the constant AVL_FRM_ID, which is 0x46524d48 (that is, "FRMH"). This field must be computed by the application before storing the data received by AvkGrpBufRead into a file. This field can be validated by AVK when the data is passed to AvkGrpBufWrite.
I32   StrmFrmSize       Consists of one long word for each of the AvLFile.StrmCnt streams in the file. This field contains the byte count for the actual data within each stream of the file. This frame data immediately follows the AvLFrm, with no padding between the frame data for successive streams.

By convention, there are some special AvLFrm.StrmFrmSize values that can be used with audio streams and compressed image streams. The size values zero, eight and sixteen indicate that no data exists for this frame. These values have slightly different interpretations by AVK playback, namely missing, silent audio and transparent video frames, respectively.

The value eight is used only for audio streams, and denotes one frame time's worth of silence. The associated eight bytes of actual data may be set to zero.

The value sixteen is used only for compressed image streams, and denotes one frame time in which the image on the display does not change. The associated 16 bytes of data must be a valid compressed image bitstream header. This header is described in the section, "Compressed Image Bitstream Header". In this context, the compressed image bitstream header consists of eight words with the values: AvLCim.DecodeAlg, 0, 128, 0, 0, 0, image_height_in_pixels, image_width_in_pixels.

A byte count of zero provides physical spacing of the frame data, but assumes that some other mechanism will be used to control the temporal spacing.
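The ChkSum rule for AvLFrm can be sketched in C as follows. This is a minimal illustration rather than AVK code. It assumes that the frame header has already been read into an array of host-order 32-bit words, that word_count is the number of 32-bit words in the header, and that ChkSum is the third word (index 2), as in the AvLFrm declaration. The helper names avl_frm_checksum and avl_frm_header_valid are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AVL_FRM_ID          0x46524d48u  /* "FRMH" */
#define AVL_FRM_CHKSUM_WORD 2            /* FrmNum = 0, RevOffset = 1, ChkSum = 2 */

/* XOR every 32-bit word of the frame header, except the ChkSum word
   itself, together with the constant AVL_FRM_ID. */
static uint32_t avl_frm_checksum(const uint32_t *hdr_words, size_t word_count)
{
    uint32_t sum = AVL_FRM_ID;
    for (size_t i = 0; i < word_count; i++) {
        if (i != AVL_FRM_CHKSUM_WORD)
            sum ^= hdr_words[i];
    }
    return sum;
}

/* Returns nonzero if the stored ChkSum word matches the computed value. */
static int avl_frm_header_valid(const uint32_t *hdr_words, size_t word_count)
{
    return word_count > AVL_FRM_CHKSUM_WORD &&
           hdr_words[AVL_FRM_CHKSUM_WORD] ==
               avl_frm_checksum(hdr_words, word_count);
}
```

An application writing a file would store the result of avl_frm_checksum in the ChkSum word before writing each frame header. A reader scanning for frame boundaries could use avl_frm_header_valid as the quick validity check that the ChkSum description mentions.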
Compressed Image Bitstream Header

For compressed image streams, the actual data for each frame begins with a 16-byte bitstream header. For PLV sequences, the image data is contained in three separate streams that do not necessarily have the same values for these fields. This bitstream header must be present for all compressed images. If a compressed image is imported from another system, such as a JPEG image, then a suitable bitstream header must be synthesized and prepended. Although a third-party JPEG-conformant decompression processor should, in theory, ignore this header, it is advisable to strip the header before exporting JPEG images.

    typedef struct {
        U16 AlgNum;
        U16 Flags;
        U32 NumBits;
        U32 AlgSpec;
        U16 YSize;
        U16 XSize;
    } AvLBsh;

The fields of the AvLBsh data structure are:

Type  Field Name        Setting/Comments
U16   AlgNum            Contains the AlgName (as passed to AvkVidStrmFormat) used to compress the image.
U16   Flags             Contains information that might be used by editor programs and for random access. This flag word contains various bits that might be useful for making decisions about individual frames. All bits and bit combinations not explicitly described in the following questions are reserved for future use, and should be masked away before making any of the following decisions.

* Is this image intraframe encoded (that is, a still frame)? Whether an image is intraframe encoded can always be determined by examining bit 0x4. If this bit is set, this image can be decompressed without reference to any other image. The first frame of a file should be a still frame.

* Can this image be used as the last image of a self-contained edited subsequence? For simple compression algorithms, the answer to this question is always yes. However, in the class of algorithms not supported by AVK, the compressed data in the bitstream might not be used until a subsequent image has also been decompressed.
To write code that will also work for such compression algorithms, editing programs should answer this question as follows:
  - The answer is yes if bit 0x80 is zero.
  - The answer is also yes if bit 0x80 is one and bit 0x40 is one.

* Can this image be replaced by a transparent image, with no effect on any other images? If bit 0x80 is zero, bits 0x300 contain a two-bit count of the number of images until the next reference frame. That is, if the 0x300 bits are 01, this image might be discarded. If they are 10, this image and the next image might be discarded. If they are 11, this image and the next two images might be discarded. If the 0x300 bits are 00, there is no information about the distance to the next intracoded image. In addition, if AvLCim.StillPeriod is not AVL_CIM_RANDOM_STILL, AvLCim.StillPeriod can be used to predict the distance to the next still image. If bit 0x80 is one, an image can be discarded if the 0x700 bits are 000.

U32   NumBits           Contains the number of bits in the image, including this header. For historical reasons, RTV 1.0 and RTV 1.5 contain a byte count instead.
U32   AlgSpec           Contains information related to the use of the VSH data with this image. For compression algorithms that do not require a VSH for decompression, this field will be zero.
U16   YSize, XSize      The height and width (respectively) of the image in pixels. For the subsampled U and V PLV streams, these fields describe the height and width of the subsampled chrominance bitmap.

Compressed Audio Bitstream Header

The frame data for an audio bitstream has an internal structure. Knowledge of this structure is useful for conversion between audio bitstream formats, and for editing files containing audio.

    typedef struct {
        I16 Word1;
        I16 Word2;
        I16 Word3;
        I16 Word4;
    };

This header is generated automatically by AVK when it digitizes and compresses.
However, if a file is edited, or a bitstream is converted from another source, the following information is needed to generate a valid frame. The fields of the compressed audio bitstream header data structure are:

Type  Field Name        Setting/Comments
I16   Word1             The number of words in this frame, not counting the four words of this header.
I16   Word2, byte1      The audio algorithm, which is encoded as follows:
                            adpcm4e-mono     1
                            adpcm4e-stereo   3
                            pcm8-mono        5
                            pcm8-stereo      7
      Word2, byte2      This field of a four-word header is always 0xFF. An audio stream that contains some other value implies that the audio bitstream did not have a four-word header. Some adpcm4e files exist that do not have this header. The format of audio streams that are missing the audio bitstream header is not described in this appendix. AVK will automatically detect the absence of the audio bitstream header, and apply appropriate defaults in order to play those bitstreams.
I16   Word3             This field contains the sample rate in samples per second.
I16   Word4             This field should be set to zero.

The remaining audio data in a frame consists of a concatenated sequence of subframes. A monophonic subframe consists of precisely 32 bytes, while a stereo subframe consists of precisely 64 bytes. Frames contain an integral number of subframes.

Ideally, for a given algorithm and sample rate, the number of subframes per frame would be a constant, generated by a simple formula (bits-per-second divided by bits-per-subframe). In practice, this formula yields a non-integral value. To deal with this non-integral value, "occasionally" a frame will have an extra subframe so that the audio bit rate will average to the right value. When converting audio data from another file format to play under AVK, these extra subframes must be inserted, because AVK uses this average bit rate to maintain lip-synch.

An audio stream can be edited by splicing together subframes.
After editing:

* A valid 4-word audio bitstream header must be generated at the beginning of each frame.
* The number of subframes per frame must be adjusted so that the average bit rate is correct.
* In addition, when splicing adpcm4e subframes, the first 16 bits of data for the first subframe of a "cut" must be zero. A "cut" is the beginning of a fragment of continuous audio.

The data in the subframes depends on the audio algorithm. The adpcm4e bitstream format inside a subframe is not documented in this appendix, but is available. The pcm8 bitstream format is very simple. A mono bitstream consists of a sequence of 8-bit values representing the instantaneous volume level. For a stereo bitstream, the data consists of a sequence of two-byte values, each of which represents the instantaneous volume to the left and right speakers respectively.

Frame Directory

The frame directory is useful for random access to a file, and consists of one AvLFrmDir header for each AvLFrm header of the file. This data is typically (but not necessarily) written towards the end of the file, after all the AvLFrm header data has been entered. Since this data structure has a fixed length, and is present for every frame, it is possible to compute the location of a specific AvLFrmDir header in the file, and seek directly to it.

    typedef struct {
        U32 FrmOffset;
    } AvLFrmDir;

The 31 least significant bits of AvLFrmDir.FrmOffset hold the offset to the associated AvLFrm header in the file. The high-order bit is 1 if this frame can be used for random access to the frame data of every stream in the file. The first frame of a file must be usable for random access. This bit can be set by using information about each compressed image, as described in the section, "Compressed Image Bitstream Header".

Theoretically, an audio stream is only suitable for random access if it has the same FrameRate as the video streams with which it is interleaved.
However, in practice, all the audio algorithms supported will quickly resynchronize, even if started in the middle of an audio frame.

A frame directory can also be generated through use of an optional parameter to the DOS media preparation utility VAvCopy. See the Media Preparation Utility Reference For DOS for details on VAvCopy.

Derived Values

The preceding sections describe the values explicitly encoded into the current file format. This section summarizes the rules for deriving the values of several quantities not explicitly present in a file.

AVK Frame Rate

In AVK, the frame rate is designated in microseconds per frame. However, since the DVI multimedia file format designates frame rate in frames per second, an application must translate between the two.

When audio is combined with video in a file with interleaved streams, each chunk of audio and video data should have the same playing time. Therefore, explicitly chunking the audio data so that it has the same frame rate as the video allows cutting and pasting of interleaved audio/video data to form edited files. A DOS media preparation utility such as VAvEd can be used for this cutting and pasting process. See the Media Preparation Utility Reference For DOS for details on VAvEd.

For AVK, the frame rate of the file is also used to control the synchronization of audio and video data. To synchronize its play against a wall clock, each stream processed uses data that is structured into frames with a known playback duration. PAL original material is also fully supported in AVK. Therefore, the following rules apply for determining the frame rate of the streams of an existing file:

* First, determine if the file has a pad stream. If a pad stream exists, the pad stream should be used to compute the frame rates of all streams in the file. To determine whether or not a file has a pad stream, examine the AvLStrm data structures, looking for a structure whose AvLStrm.Type is AVL_T_PAD (that is, 7).
* In rare cases, a file can contain several pad streams, in which case the last pad stream should be used for frame rate calculations. Since the AvLStrm header data is stored as an array of fixed length structures, it is straightforward to search all AvLStrm header data in reverse order.

* After identifying the pad stream, access AvLPad.ImagesPer and AvLPad.Seconds. These values can be converted to a frame rate in AVK units (microseconds per frame), using the formula:

      FrameRate = Round((1,000,000 * AvLPad.Seconds) / AvLPad.ImagesPer)

* If the file does not contain a pad stream, the frame rate of video streams is found by examining AvLFile.FrmsPerSec. This value is stored as an integer. To convert it to a frame rate, use the formula:

      FrameRate = Round((1,000,000 / AvLFile.FrmsPerSec) * NTSC_ADJUSTMENT)

  The symbol NTSC_ADJUSTMENT is 1001/1000 or 1. The value "1" is only used when AvLFile.FrmsPerSec is exactly 25.

* If the file does not contain a pad stream, the frame rate for an audio stream can be determined by examining AvLAud.Parm4. If AvLAud.Parm4 is not present (as indicated by the value in AvLAud.HdrSize or AvLAud.HdrVer), or if its value is zero, the audio frame rate is derived from AvLFile.FrmsPerSec, as described. If AvLAud.Parm4 is nonzero, the audio frame rate is derived by substituting AvLAud.Parm4 for AvLFile.FrmsPerSec in the formula above.

When a file is created, all its streams should have the same frame rate. If this frame rate can be accurately represented via AvLFile.FrmsPerSec, there is no need to include a pad stream. However, if the frame rate cannot be regenerated by the above calculation, the file should be created with a pad stream so that an accurate frame rate can be stored within it.

Maximum Frame Size And Group Buffer Size

When creating a group buffer for AVK, it is important to know the largest size of a frame that might be found in the file.
While this information is not explicitly recorded in a file, a reasonable upper limit can be estimated by using the AvLStrm.FrmSize values in each AvLStrm. This upper bound is the sum of the AvLStrm.FrmSize fields from every stream of the file, plus the length of a frame header (that